---
title: "Storing Embeddings in a Vector Database | RAG"
description: Guide on vector storage options in Mastra, including embedded and dedicated vector databases for similarity search.
---

import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

# Storing Embeddings in a Vector Database

After generating embeddings, you need to store them in a database that supports vector similarity search. Mastra provides a consistent interface for storing and querying embeddings across various vector databases.

## Supported Databases

<Tabs>
  <TabItem value="mongodb" label="MongoDB">

```ts title="vector-store.ts" showLineNumbers copy
import { MongoDBVector } from "@mastra/mongodb";

const store = new MongoDBVector({
  uri: process.env.MONGODB_URI,
  dbName: process.env.MONGODB_DATABASE,
});
await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});
await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

### Using MongoDB Atlas Vector Search

For detailed setup instructions and best practices, see the [official MongoDB Atlas Vector Search documentation](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/?utm_campaign=devrel&utm_source=third-party-content&utm_medium=cta&utm_content=mastra-docs).

  </TabItem>

  <TabItem value="pg-vector" label="PgVector">

```ts title="vector-store.ts" showLineNumbers copy
import { PgVector } from "@mastra/pg";

const store = new PgVector({
  id: "pg-vector",
  connectionString: process.env.POSTGRES_CONNECTION_STRING,
});

await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});

await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

### Using PostgreSQL with pgvector

PostgreSQL with the pgvector extension is a good solution for teams already using PostgreSQL who want to minimize infrastructure complexity.
For detailed setup instructions and best practices, see the [official pgvector repository](https://github.com/pgvector/pgvector).

  </TabItem>

  <TabItem value="pinecone" label="Pinecone">

```ts title="vector-store.ts" showLineNumbers copy
import { PineconeVector } from "@mastra/pinecone";

const store = new PineconeVector({
  id: "pinecone-vector",
  apiKey: process.env.PINECONE_API_KEY,
});
await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});
await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="qdrant" label="Qdrant">

```ts title="vector-store.ts" showLineNumbers copy
import { QdrantVector } from "@mastra/qdrant";

const store = new QdrantVector({
  id: "qdrant-vector",
  url: process.env.QDRANT_URL,
  apiKey: process.env.QDRANT_API_KEY,
});

await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});

await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="chroma" label="Chroma">

```ts title="vector-store.ts" showLineNumbers copy
import { ChromaVector } from "@mastra/chroma";

// Running Chroma locally
// const store = new ChromaVector()

// Running on Chroma Cloud
const store = new ChromaVector({
  id: "chroma-vector",
  apiKey: process.env.CHROMA_API_KEY,
  tenant: process.env.CHROMA_TENANT,
  database: process.env.CHROMA_DATABASE,
});

await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});

await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="astra" label="Astra">

```ts title="vector-store.ts" showLineNumbers copy
import { AstraVector } from "@mastra/astra";

const store = new AstraVector({
  token: process.env.ASTRA_DB_TOKEN,
  endpoint: process.env.ASTRA_DB_ENDPOINT,
  keyspace: process.env.ASTRA_DB_KEYSPACE,
});

await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});

await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="libsql" label="LibSQL">

```ts title="vector-store.ts" showLineNumbers copy
import { LibSQLVector } from "@mastra/libsql";

const store = new LibSQLVector({
  id: "libsql-vector",
  connectionUrl: process.env.DATABASE_URL,
  authToken: process.env.DATABASE_AUTH_TOKEN, // Optional: for Turso cloud databases
});

await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});

await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="upstash" label="Upstash">

```ts title="vector-store.ts" showLineNumbers copy
import { UpstashVector } from "@mastra/upstash";

// Upstash calls the store itself an index
const store = new UpstashVector({
  id: "upstash-vector",
  url: process.env.UPSTASH_URL,
  token: process.env.UPSTASH_TOKEN,
});

// No store.createIndex call is needed here: a Mastra index maps to an Upstash namespace,
// which Upstash creates automatically on upsert if it does not exist yet.
await store.upsert({
  indexName: "myCollection", // the namespace name in Upstash
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="cloudflare" label="Cloudflare">

```ts title="vector-store.ts" showLineNumbers copy
import { CloudflareVector } from "@mastra/vectorize";

const store = new CloudflareVector({
  accountId: process.env.CF_ACCOUNT_ID,
  apiToken: process.env.CF_API_TOKEN,
});
await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});
await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="opensearch" label="OpenSearch">

```ts title="vector-store.ts" showLineNumbers copy
import { OpenSearchVector } from "@mastra/opensearch";

const store = new OpenSearchVector({ url: process.env.OPENSEARCH_URL });

await store.createIndex({
  indexName: "my-collection",
  dimension: 1536,
});

await store.upsert({
  indexName: "my-collection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

  <TabItem value="elasticsearch" label="ElasticSearch">

```ts title="vector-store.ts" showLineNumbers copy
import { ElasticSearchVector } from "@mastra/elasticsearch";

const store = new ElasticSearchVector({ url: process.env.ELASTICSEARCH_URL });

await store.createIndex({
  indexName: "my-collection",
  dimension: 1536,
});

await store.upsert({
  indexName: "my-collection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>
  <TabItem value="couchbase" label="Couchbase">

```ts title="vector-store.ts" showLineNumbers copy
import { CouchbaseVector } from "@mastra/couchbase";

const store = new CouchbaseVector({
  connectionString: process.env.COUCHBASE_CONNECTION_STRING,
  username: process.env.COUCHBASE_USERNAME,
  password: process.env.COUCHBASE_PASSWORD,
  bucketName: process.env.COUCHBASE_BUCKET,
  scopeName: process.env.COUCHBASE_SCOPE,
  collectionName: process.env.COUCHBASE_COLLECTION,
});
await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});
await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>
  <TabItem value="lancedb" label="Lance">

```ts title="vector-store.ts" showLineNumbers copy
import { LanceVectorStore } from "@mastra/lance";

const store = await LanceVectorStore.create("/path/to/db");

await store.createIndex({
  tableName: "myVectors",
  indexName: "myCollection",
  dimension: 1536,
});

await store.upsert({
  tableName: "myVectors",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

### Using LanceDB

LanceDB is an embedded vector database built on the Lance columnar format, suitable for local development or cloud deployment.
For detailed setup instructions and best practices, see the [official LanceDB documentation](https://lancedb.github.io/lancedb/).

  </TabItem>
  <TabItem value="s3vectors" label="S3 Vectors">

```ts title="vector-store.ts" showLineNumbers copy
import { S3Vectors } from "@mastra/s3vectors";

const store = new S3Vectors({
  vectorBucketName: "my-vector-bucket",
  clientConfig: {
    region: "us-east-1",
  },
  nonFilterableMetadataKeys: ["content"],
});

await store.createIndex({
  indexName: "my-index",
  dimension: 1536,
});
await store.upsert({
  indexName: "my-index",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```

  </TabItem>

</Tabs>

## Using Vector Storage

Once initialized, all vector stores share the same interface for creating indexes, upserting embeddings, and querying.

### Creating Indexes

Before storing embeddings, you need to create an index with the appropriate dimension size for your embedding model:

```ts title="store-embeddings.ts" showLineNumbers copy
// Create an index with dimension 1536 (for text-embedding-3-small)
await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
});
```

The dimension size must match the output dimension of your chosen embedding model. Common dimension sizes are:

- OpenAI `text-embedding-3-small`: 1536 dimensions (or custom, e.g., 256)
- Cohere `embed-multilingual-v3.0`: 1024 dimensions
- Google `text-embedding-004`: 768 dimensions (or custom)

> **Important**: Index dimensions cannot be changed after creation. To use a different model, delete and recreate the index with the new dimension size.
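
One way to catch a dimension mismatch early is to check vector lengths before upserting. The following is a minimal sketch: `assertEmbeddingDimension` is a hypothetical helper written for this guide, not part of Mastra.

```ts showLineNumbers copy
// Hypothetical helper (not a Mastra API): verifies that every embedding has
// the dimension the index was created with before store.upsert() is called.
function assertEmbeddingDimension(
  embeddings: number[][],
  expectedDimension: number,
): void {
  embeddings.forEach((vector, i) => {
    if (vector.length !== expectedDimension) {
      throw new Error(
        `Embedding ${i} has ${vector.length} dimensions, expected ${expectedDimension}`,
      );
    }
  });
}
```

Calling `assertEmbeddingDimension(embeddings, 1536)` just before `store.upsert` turns a model or configuration mix-up into an explicit error message in your own code.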

### Naming Rules for Databases

Each vector database enforces specific naming conventions for indexes and collections to ensure compatibility and prevent conflicts.

<Tabs>
  <TabItem value="mongodb" label="MongoDB">
    Collection (index) names must:
    - Start with a letter or underscore
    - Be up to 120 bytes long
    - Contain only letters, numbers, underscores, or dots
    - Not contain `$` or the null character
    - Example: `my_collection.123` is valid
    - Example: `my-index` is not valid (contains hyphen)
    - Example: `My$Collection` is not valid (contains `$`)
  </TabItem>
  <TabItem value="pgVector" label="PgVector">
    Index names must:
    - Start with a letter or underscore
    - Contain only letters, numbers, and underscores
    - Example: `my_index_123` is valid
    - Example: `my-index` is not valid (contains hyphen)
  </TabItem>
  <TabItem value="pinecone" label="Pinecone">
    Index names must:
    - Use only lowercase letters, numbers, and dashes
    - Not contain dots (used for DNS routing)
    - Not use non-Latin characters or emojis
    - Have a combined length (with project ID) under 52 characters
      - Example: `my-index-123` is valid
      - Example: `my.index` is not valid (contains dot)
  </TabItem>
  <TabItem value="qdrant" label="Qdrant">
    Collection names must:
    - Be 1-255 characters long
    - Not contain any of these special characters:
      - `< > : " / \ | ? *`
      - Null character (`\0`)
      - Unit separator (`\u{1F}`)
    - Example: `my_collection_123` is valid
    - Example: `my/collection` is not valid (contains slash)
  </TabItem>
  <TabItem value="chroma" label="Chroma">
    Collection names must:
    - Be 3-63 characters long
    - Start and end with a letter or number
    - Contain only letters, numbers, underscores, or hyphens
    - Not contain consecutive periods (..)
    - Not be a valid IPv4 address
    - Example: `my-collection-123` is valid
    - Example: `my..collection` is not valid (consecutive periods)
  </TabItem>
  <TabItem value="astra" label="Astra">
    Collection names must:
    - Not be empty
    - Be 48 characters or less
    - Contain only letters, numbers, and underscores
    - Example: `my_collection_123` is valid
    - Example: `my-collection` is not valid (contains hyphen)
  </TabItem>
  <TabItem value="libsql" label="LibSQL">
    Index names must:
    - Start with a letter or underscore
    - Contain only letters, numbers, and underscores
    - Example: `my_index_123` is valid
    - Example: `my-index` is not valid (contains hyphen)
  </TabItem>
  <TabItem value="upstash" label="Upstash">
    Namespace names must:
    - Be 2-100 characters long
    - Contain only:
      - Alphanumeric characters (a-z, A-Z, 0-9)
      - Underscores, hyphens, dots
    - Not start or end with special characters (_, -, .)
    - May mix uppercase and lowercase letters (names are case-sensitive)
    - Example: `MyNamespace123` is valid
    - Example: `_namespace` is not valid (starts with underscore)
  </TabItem>
  <TabItem value="cloudflare" label="Cloudflare">
    Index names must:
    - Start with a letter
    - Be shorter than 32 characters
    - Contain only lowercase ASCII letters, numbers, and dashes
    - Use dashes instead of spaces
    - Example: `my-index-123` is valid
    - Example: `My_Index` is not valid (uppercase and underscore)
  </TabItem>
  <TabItem value="opensearch" label="OpenSearch">
    Index names must:
    - Not contain uppercase letters
    - Not begin with underscores or hyphens
    - Not contain spaces or commas
    - Not contain special characters (e.g. `:`, `"`, `*`, `+`, `/`, `\`, `|`, `?`, `#`, `>`, `<`)
    - Example: `my-index-123` is valid
    - Example: `My_Index` is not valid (contains uppercase letters)
    - Example: `_myindex` is not valid (begins with underscore)
  </TabItem>
  <TabItem value="elasticsearch" label="ElasticSearch">
    Index names must:
    - Not contain uppercase letters
    - Not exceed 255 bytes (counting multi-byte characters)
    - Not begin with underscores, hyphens, or plus signs
    - Not contain spaces or commas
    - Not contain special characters (e.g. `:`, `"`, `*`, `+`, `/`, `\`, `|`, `?`, `#`, `>`, `<`)
    - Not be "." or ".."
    - Not start with "." (deprecated except for system/hidden indices)
    - Example: `my-index-123` is valid
    - Example: `My_Index` is not valid (contains uppercase letters)
    - Example: `_myindex` is not valid (begins with underscore)
    - Example: `.myindex` is not valid (begins with dot, deprecated)
  </TabItem>
  <TabItem value="s3vectors" label="S3 Vectors">
    Index names must:
    - Be unique within the same vector bucket
    - Be 3–63 characters long
    - Use only lowercase letters (`a–z`), numbers (`0–9`), hyphens (`-`), and dots (`.`)
    - Begin and end with a letter or number
    - Example: `my-index.123` is valid
    - Example: `my_index` is not valid (contains underscore)
    - Example: `-myindex` is not valid (begins with hyphen)
    - Example: `myindex-` is not valid (ends with hyphen)
    - Example: `MyIndex` is not valid (contains uppercase letters)
  </TabItem>
</Tabs>
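
If index names come from user input or configuration, a small pre-check can reject invalid names before any network call. The sketch below covers four of the stores, with regular expressions transcribed from the rules listed above. `indexNamePatterns` and `isValidIndexName` are hypothetical names, and the patterns are an illustration rather than validation code taken from Mastra or the database vendors.

```ts showLineNumbers copy
// Patterns transcribed from the naming rules in this guide (illustrative only).
const indexNamePatterns = {
  // PgVector and LibSQL: letter or underscore first, then letters, numbers, underscores
  pgvector: /^[A-Za-z_][A-Za-z0-9_]*$/,
  libsql: /^[A-Za-z_][A-Za-z0-9_]*$/,
  // Cloudflare: starts with a letter, shorter than 32 characters,
  // lowercase ASCII letters, numbers, and dashes only
  cloudflare: /^[a-z][a-z0-9-]{0,30}$/,
  // S3 Vectors: 3-63 characters, lowercase letters, numbers, hyphens, dots,
  // beginning and ending with a letter or number
  s3vectors: /^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$/,
} as const;

type CheckedStore = keyof typeof indexNamePatterns;

// Returns true when the name satisfies the pattern for the given store.
function isValidIndexName(store: CheckedStore, name: string): boolean {
  return indexNamePatterns[store].test(name);
}

isValidIndexName("pgvector", "my_index_123"); // true
isValidIndexName("pgvector", "my-index"); // false (contains hyphen)
isValidIndexName("s3vectors", "my-index.123"); // true
```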

### Upserting Embeddings

After creating an index, you can store embeddings along with their basic metadata:

```ts title="store-embeddings.ts" showLineNumbers copy
// Store embeddings with their corresponding metadata
await store.upsert({
  indexName: "myCollection", // index name
  vectors: embeddings, // array of embedding vectors
  metadata: chunks.map((chunk) => ({
    text: chunk.text, // The original text content
    id: chunk.id, // Optional unique identifier
  })),
});
```

The upsert operation:

- Takes an array of embedding vectors and their corresponding metadata
- Updates existing vectors if they share the same ID
- Creates new vectors if they don't exist
- Automatically handles batching for large datasets
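
For the update behavior to apply, the same vector must receive the same ID on every run. The sketch below derives stable IDs from a document ID and the chunk position. `buildVectorIds` is a hypothetical helper, and the commented call assumes the store's `upsert` accepts an optional `ids` array aligned with `vectors`; check the reference page for your store before relying on it.

```ts showLineNumbers copy
// Hypothetical helper: one deterministic ID per vector, based on the document
// ID and the vector's position, so re-running ingestion targets the same records.
function buildVectorIds(docId: string, vectors: number[][]): string[] {
  return vectors.map((_, i) => `${docId}-chunk-${i}`);
}

// Assumption: `upsert` takes an optional `ids` array in the same order as `vectors`.
// await store.upsert({
//   indexName: "myCollection",
//   vectors: embeddings,
//   metadata: chunks.map((chunk) => ({ text: chunk.text })),
//   ids: buildVectorIds("document-123", embeddings),
// });
```

Position-based IDs stay stable only while the chunking strategy is unchanged. If a document is re-chunked into fewer pieces, the leftover vectors from the previous run still need to be deleted.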

## Adding Metadata

Vector stores support rich metadata (any JSON-serializable fields) for filtering and organization. Since metadata is stored with no fixed schema, use consistent field naming to avoid unexpected query results.

> **Important**: Metadata is crucial for vector storage. Without it, you'd have only numerical embeddings, with no way to return the original text or filter results. Always store at least the source text as metadata.

```ts showLineNumbers copy
// Store embeddings with rich metadata for better organization and filtering
await store.upsert({
  indexName: "myCollection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({
    // Basic content
    text: chunk.text,
    id: chunk.id,

    // Document organization
    source: chunk.source,
    category: chunk.category,

    // Temporal metadata
    createdAt: new Date().toISOString(),
    version: "1.0",

    // Custom fields
    language: chunk.language,
    author: chunk.author,
    confidenceScore: chunk.score,
  })),
});
```

Key metadata considerations:

- Be strict with field naming: inconsistencies like `category` vs `Category` will affect queries
- Beyond the source text, only include fields you plan to filter or sort by, since extra fields add overhead
- Add timestamps (e.g., `createdAt`, `lastUpdated`) to track content freshness
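
One way to keep field names consistent is to build every metadata object through a single typed function. This is a minimal sketch: the `ChunkMetadata` shape, the `Chunk` input type, and `toMetadata` are hypothetical and only illustrate the idea; they are not a schema that Mastra defines.

```ts showLineNumbers copy
// Hypothetical metadata shape; field names are illustrative.
interface ChunkMetadata {
  text: string;
  source: string;
  category: string;
  createdAt: string; // ISO 8601 timestamp
}

// Hypothetical input type for a chunk produced earlier in the pipeline.
interface Chunk {
  text: string;
  source: string;
  category: string;
}

// Building metadata in one place fixes key names and casing, so a filter on
// `category` cannot miss records that were stored under `Category`.
function toMetadata(chunk: Chunk, now: Date = new Date()): ChunkMetadata {
  return {
    text: chunk.text,
    source: chunk.source,
    // Normalize values too, so "Guides" and "guides" filter identically.
    category: chunk.category.trim().toLowerCase(),
    createdAt: now.toISOString(),
  };
}
```

The result can be passed to `upsert` as `metadata: chunks.map((chunk) => toMetadata(chunk))`.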

## Deleting Vectors

When building RAG applications, you often need to clean up stale vectors when documents are deleted or updated. Mastra provides a `deleteVectors` method that deletes vectors by metadata filter, making it easy to remove all embeddings associated with a specific document.

### Delete by Metadata Filter

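The most common use case is deleting all vectors for a specific document when a user deletes it. The filter below only matches if a `docId` field was stored in each chunk's metadata at upsert time: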

```ts title="delete-vectors.ts" showLineNumbers copy
// Delete all vectors for a specific document
await store.deleteVectors({
  indexName: "myCollection",
  filter: { docId: "document-123" },
});
```

This is particularly useful when:

- A user deletes a document and you need to remove all its chunks
- You're re-indexing a document and want to remove old vectors first
- You need to clean up vectors for a specific user or tenant
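
The re-indexing case can be sketched as a delete followed by an upsert. `ReindexableStore` is a hypothetical structural type that covers only the two calls used here, with shapes mirroring the examples in this guide rather than Mastra's full interface. `reindexDocument` is also hypothetical, and it assumes `docId` is written into every chunk's metadata.

```ts showLineNumbers copy
// Minimal structural type for the two calls used below (an assumption that
// mirrors the examples in this guide, not Mastra's complete vector interface).
interface ReindexableStore {
  upsert(args: {
    indexName: string;
    vectors: number[][];
    metadata?: Record<string, unknown>[];
  }): Promise<unknown>;
  deleteVectors(args: {
    indexName: string;
    filter: Record<string, unknown>;
  }): Promise<unknown>;
}

// Hypothetical helper: replaces every vector that belongs to one document.
async function reindexDocument(
  store: ReindexableStore,
  indexName: string,
  docId: string,
  chunks: { text: string }[],
  embeddings: number[][],
): Promise<void> {
  if (chunks.length !== embeddings.length) {
    throw new Error("chunks and embeddings must have the same length");
  }
  // Remove the old chunks first so stale text cannot be retrieved.
  await store.deleteVectors({ indexName, filter: { docId } });
  // Store docId in metadata so the same filter matches on the next re-index.
  await store.upsert({
    indexName,
    vectors: embeddings,
    metadata: chunks.map((chunk) => ({ text: chunk.text, docId })),
  });
}
```

The two calls are not atomic: between them, queries return nothing for that document. If that gap matters, upserting the new vectors under a new version field and deleting the old version afterwards is one alternative.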

### Delete Multiple Documents

You can also use complex filters to delete vectors matching multiple conditions:

```ts title="delete-vectors-advanced.ts" showLineNumbers copy
// Delete all vectors for multiple documents
await store.deleteVectors({
  indexName: "myCollection",
  filter: {
    docId: { $in: ["doc-1", "doc-2", "doc-3"] },
  },
});

// Delete a specific user's archived vectors (both conditions must match)
await store.deleteVectors({
  indexName: "myCollection",
  filter: {
    $and: [
      { userId: "user-123" },
      { status: "archived" },
    ],
  },
});
```
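
To make the semantics of these operators concrete, the sketch below evaluates the same filter shapes against a plain metadata object in memory. It is purely illustrative: `matchesFilter` is a hypothetical function that handles only equality, `$in`, and `$and`, and each vector store translates filters into its own query language instead of running code like this.

```ts showLineNumbers copy
type Metadata = Record<string, unknown>;
type Filter = Record<string, unknown>;

// Illustrative only: returns true when `metadata` satisfies `filter`.
function matchesFilter(metadata: Metadata, filter: Filter): boolean {
  return Object.keys(filter).every((key) => {
    const condition = filter[key];
    if (key === "$and") {
      // $and: every sub-filter must match.
      return (condition as Filter[]).every((sub) => matchesFilter(metadata, sub));
    }
    if (typeof condition === "object" && condition !== null && "$in" in condition) {
      // $in: the field value must be one of the listed values.
      return (condition as { $in: unknown[] }).$in.indexOf(metadata[key]) !== -1;
    }
    // A bare value means strict equality on that field.
    return metadata[key] === condition;
  });
}

matchesFilter({ docId: "doc-2" }, { docId: { $in: ["doc-1", "doc-2", "doc-3"] } }); // true
matchesFilter(
  { userId: "user-123", status: "active" },
  { $and: [{ userId: "user-123" }, { status: "archived" }] },
); // false (status does not match)
```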

### Delete by Vector IDs

If you have specific vector IDs to delete, you can pass them directly:

```ts title="delete-by-ids.ts" showLineNumbers copy
// Delete specific vectors by their IDs
await store.deleteVectors({
  indexName: "myCollection",
  ids: ["vec-1", "vec-2", "vec-3"],
});
```

## Best Practices

- Create indexes before bulk insertions
- Use batch operations for large insertions (the upsert method handles batching automatically)
- Beyond the source text, only store metadata you'll query against
- Match embedding dimensions to your model (e.g., 1536 for `text-embedding-3-small`)
